Speech Recognition and Information Retrieval:
Abstract
The Informedia Digital Video Library Project at Carnegie Mellon University is creating large digital libraries of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. These digital video libraries allow users to explore multi-media data in depth as well as in breadth. The Informedia system automatically processes and indexes video and audio sources and allows selective retrieval of short video segments based on spoken queries. Interactive queries allow the user to retrieve stories of interest from all the sources that contained segments on a particular topic. Informedia will display representative icons for relevant segments, allowing the user to select interesting video paragraphs for playback.
Related Resources
A New Information Retrieval Method Suited to Texts Produced by Speech Recognition
This article introduces a pre-processing method for the task of retrieving speech-recognized texts. The inputs are a text corpus generated by a speech recognition system and a query; the goal is to search these documents for the query and find the relevant ones. A basic problem with typical speech-recognized text is that some percentage of it is misrecognized. This results erroneously ass...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieving from this archive, is one of the key issues for this broadcasting...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents
The Informedia Digital Video Library Project at Carnegie Mellon University is making large corpora of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. Information retrieval from corpora of speech recognition output is critical to the project’s success. In this paper, we out...
Extra large vocabulary continuous speech recognition algorithm based on information retrieval
This paper presents a new two-pass algorithm for Extra Large (more than 1M words) Vocabulary COntinuous Speech recognition based on Information Retrieval (ELVIRCOS). The principle of this approach is to decompose the recognition process into two passes, where the first pass builds the word subset for second-pass recognition using an information retrieval procedure. Word graph composition ...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...